Goto

Collaborating Authors

 unsupervised random forest


Reviews: When do random forests fail?

Neural Information Processing Systems

This is a fairly serious omission, and casual readers would remember the wrong conclusions. This must be fixed for publication, but I think it would be straightforward to fix. Officially, NIPS reviewers are not required to look at the supplementary material. Because of having only three weeks to review six manuscripts, I was not able to make the time during my reviewing. So I worry that publishing this work would mean publishing results without sufficient peer review. DETAILED COMMENTS * p. 1: I'm not sure it is accurate to say that deep, unsupervised trees grown with no subsampling is a common setup for learning random forests. It appears in Geurts et al. (2006) as a special case, sometimes in mass estimation [1, 2], and sometimes in Wei Fan's random decision tree papers [3-6]. I don't think these are used very much.


Federated unsupervised random forest for privacy-preserving patient stratification

Pfeifer, Bastian, Sirocchi, Christel, Bloice, Marcus D., Kreuzthaler, Markus, Urschler, Martin

arXiv.org Artificial Intelligence

In the realm of precision medicine, effective patient stratification and disease subtyping demand innovative methodologies tailored for multi-omics data. Clustering techniques applied to multi-omics data have become instrumental in identifying distinct subgroups of patients, enabling a finer-grained understanding of disease variability. This work establishes a powerful framework for advancing precision medicine through unsupervised random-forest-based clustering and federated computing. We introduce a novel multi-omics clustering approach utilizing unsupervised random-forests. The unsupervised nature of the random forest enables the determination of cluster-specific feature importance, unraveling key molecular contributors to distinct patient groups. Moreover, our methodology is designed for federated execution, a crucial aspect in the medical domain where privacy concerns are paramount. We have validated our approach on machine learning benchmark data sets as well as on cancer data from The Cancer Genome Atlas (TCGA). Our method is competitive with the state-of-the-art in terms of disease subtyping, but at the same time substantially improves the cluster interpretability. Experiments indicate that local clustering performance can be improved through federated computing.